Add last resolve time metric #52

Merged 9 commits on Jul 9, 2024

Conversation

lalalalatt
Contributor

@lalalalatt lalalalatt commented Jul 6, 2024

As described in authzed/spicedb#422:

It's important for an operator to know that spicedb is continually receiving updated peer lists. Emit metrics that can be used to alert if the list is not updated expediently.

@lalalalatt
Contributor Author

@sercand, please review this PR. Thanks.

builder.go Outdated
@@ -283,6 +296,7 @@ func (k *kResolver) watch() error {
if hasMore {
k.handle(up.Object)
} else {
k.lastResolve.Set(float64(time.Since(k.lastResolveTime).Seconds()))
Owner

@lalalalatt did this change achieve what you wanted, and did you test it? I don't think lastResolve will be updated frequently, so what you will see in the Prometheus metrics is "0".

@lalalalatt
Contributor Author

lalalalatt commented Jul 8, 2024

@sercand Thanks for the correction.
I had previously misunderstood the interaction between the gRPC resolver and kResolver.

I have now added a test for the last-update Unix metric. Here is the full test result:

 ❯ go test . -v
=== RUN   TestBuilder
0, address: 192.168.194.47:53
0, servername: kube-dns.kube-system
 --- PASS: TestBuilder (0.02s)
+ === RUN   TestResolveLag
+ 0, address: 192.168.194.47:53
+ 0, servername: kube-dns.kube-system
+     builder_test.go:93: client resolver lag: 2 s
+ --- PASS: TestResolveLag (2.01s)
=== RUN   TestParseResolverTarget
 --- PASS: TestParseResolverTarget (0.00s)
=== RUN   TestParseTargets
 --- PASS: TestParseTargets (0.00s)
PASS
ok      github.com/sercand/kuberesolver/v5

@lalalalatt lalalalatt requested a review from sercand July 8, 2024 04:41
@lalalalatt
Contributor Author

lalalalatt commented Jul 9, 2024

I changed the metric to record the time of the latest resolverClient.UpdateState() call, which avoids adding a frequently firing ticker.
Users can then set up an alert by putting a threshold on time.Now() minus the last-update Unix value.
Example:

groups:
  - name: update_state_alerts
    rules:
      - alert: UpdateStateStale
        expr: time() - last_update_state_time_seconds > 60
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "UpdateState has not been called recently"
          description: "It has been more than 60 seconds since the last UpdateState call."
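
For readers who want to see the idea in code, here is a minimal Go sketch of the same staleness check, assuming the metric holds the Unix time of the last UpdateState call. The `lastUpdate` type and its `Mark` and `Stale` methods are hypothetical names for illustration only; they are not part of kuberesolver or the Prometheus client, and the atomic counter merely stands in for a real gauge.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// lastUpdate is a hypothetical stand-in for a gauge that stores the Unix
// timestamp (in seconds) of the most recent state update.
type lastUpdate struct {
	unix atomic.Int64
}

// Mark records nowUnix as the time of the latest update. In the PR's design
// this would happen once per UpdateState call, so no ticker is needed.
func (l *lastUpdate) Mark(nowUnix int64) {
	l.unix.Store(nowUnix)
}

// Stale mirrors the alert expression `time() - metric > threshold`:
// it reports whether more than thresholdSeconds have passed since Mark.
func (l *lastUpdate) Stale(nowUnix, thresholdSeconds int64) bool {
	return nowUnix-l.unix.Load() > thresholdSeconds
}

func main() {
	var g lastUpdate
	now := time.Now().Unix()
	g.Mark(now)

	fmt.Println(g.Stale(now, 60))     // checked right after an update
	fmt.Println(g.Stale(now+120, 60)) // checked two minutes later
}
```

Because the age is computed when the value is read rather than when it is written, the stored value only needs to change when an update actually happens.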

@sercand sercand merged commit b382846 into sercand:master Jul 9, 2024
3 participants